Crosslinguistic Transfer in Automatic Verb Classification

Authors

  • Vivian Tsang
  • Suzanne Stevenson
  • Paola Merlo
Abstract

We investigate the use of multilingual data in the automatic classification of English verbs, and show that there is a useful transfer of information across languages. Specifically, we experiment with three lexical semantic classes of English verbs. We collect statistical features over a sample of English verbs from each of the classes, as well as over Chinese translations of those verbs. We use the English and Chinese data, alone and in combination, as training data for a machine learning algorithm whose output is an automatic verb classifier. We demonstrate that Chinese data is indeed useful in helping to classify the English verbs (at 82% accuracy), and furthermore that a multilingual combination of data outperforms the English data alone (85% accuracy). Moreover, our results using monolingual corpora show that it is not necessary to use a parallel corpus to extract the translations in order for this technique to be successful.
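The idea of training on English features, Chinese features, or both can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the verb names (`v1` to `v6`), the class labels, the feature values, and the helper functions are hypothetical, and a simple nearest-centroid rule stands in for the unspecified machine learning algorithm mentioned in the abstract.

```python
from collections import defaultdict
from math import dist

# Hypothetical per-verb feature vectors (toy values, not data from the paper).
english = {"v1": [0.9, 0.1], "v2": [0.8, 0.2], "v3": [0.1, 0.9],
           "v4": [0.2, 0.8], "v5": [0.5, 0.5], "v6": [0.6, 0.4]}
# Hypothetical features collected over Chinese translations of the same verbs.
chinese = {"v1": [0.7], "v2": [0.9], "v3": [0.2],
           "v4": [0.1], "v5": [0.5], "v6": [0.4]}
# Placeholder class labels; the abstract does not name the three classes.
labels = {"v1": "class_a", "v2": "class_a", "v3": "class_b",
          "v4": "class_b", "v5": "class_c", "v6": "class_c"}

def combine(first, second):
    """Concatenate the two feature vectors available for each verb."""
    return {verb: first[verb] + second[verb] for verb in first}

def train_centroids(features, labels):
    """Compute the mean feature vector of each class."""
    grouped = defaultdict(list)
    for verb, vec in features.items():
        grouped[labels[verb]].append(vec)
    return {cls: [sum(col) / len(vecs) for col in zip(*vecs)]
            for cls, vecs in grouped.items()}

def classify(vec, centroids):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda cls: dist(vec, centroids[cls]))

# Multilingual training data: English and Chinese features side by side.
combined = combine(english, chinese)
centroids = train_centroids(combined, labels)
```

The same `train_centroids` call works on `english` or `chinese` alone, which mirrors the abstract's comparison of monolingual and combined training data.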


Similar resources

Proceedings of the 19th COLING, 1023-1029, 2002. Crosslinguistic Transfer in Automatic Verb Classification


Automatic verb classification using multilingual resources

We propose the use of multilingual corpora in the automatic classification of verbs. We extend the work of Merlo and Stevenson (2001), in which statistics over simple syntactic features extracted from textual corpora were used to train an automatic classifier for three lexical semantic classes of English verbs. We hypothesize that some lexical semantic features that are difficult to detect superfic...



Automatic Lexical Acquisition Based on Statistical Distributions

We automatically classify verbs into lexical semantic classes, based on distributions of indicators of verb alternations, extracted from a very large annotated corpus. We address a problem which is particularly difficult because the verb classes, although semantically different, show similar surface syntactic behavior. Five grammatical features are sufficient to reduce error rate by more than 50% ov...



Publication date: 2002